Experimental Analysis of Algorithms

Author

  • Catherine C. McGeoch
Abstract

Algorithm analysis inhabits two worlds: the abstract and the physical. If you want to understand the fundamental and universal properties of a computational process, study the abstract algorithm and prove a theorem about it. If you want to know how the process really works, implement the algorithm as a program and measure the running time (or another quantity of interest). This distinction between the abstract model and the physical artifact exists in the study of computational processes just as in every other area of mathematical modeling. But algorithmic problems have some unusual features. For example, we usually build models to serve as handy representations of natural phenomena that cannot be observed or manipulated directly. But programs and computers are completely accessible to the researcher and are far more manipulable than, say, weather patterns. They are also much easier to understand: hypothetically, one could obtain complete information about the behavior of a program by consulting technical documents and code. And finally, algorithms are usually invented before programs are implemented, not the other way around. This article surveys problems and opportunities that lie in the interface between theory and practice in a relatively new research area that has been called experimental algorithmics, experimental analysis of algorithms, or algorithm engineering. Much work in this field is directed at finding and evaluating state-of-the-art implementations for given applications. Another effort focuses on using experiments to extend and improve the kinds of results obtained in traditional algorithm analysis. That is, rather than having a goal of measuring programs, we develop experiments in order to better understand algorithms, which are abstract mathematical objects.

Catherine C. McGeoch is an associate professor of computer science at Amherst College. She has been active in the development of an experimental tradition in algorithmic studies. Her e-mail address is [email protected].
In this article we concentrate on examples from this latter type of research in experimental algorithmics. It is natural to wonder whether such an effort is likely to bear fruit. If the ultimate goal of algorithm analysis is to produce better programs, wouldn't we be better off studying programs in their natural habitats (computers) rather than performing experiments on their abstractions? Conversely, if the goal is to advance knowledge in an area of mathematical research (algorithm analysis), are we wise to study abstract objects using imperfect, finite, and ever-changing measurement tools?

One answer to the first question is that reliable program measurement is not as easy as it sounds. Running times, for example, depend on complex interactions among the variety of products that make up the programming environment, including the computer chip (perhaps Intel Inside), memory sizes and configurations, the operating system (such as Windows 98 or Unix), the programming language (maybe Java or C), and the brand of compiler (like CodeWarrior).¹ These products are sophisticated, varied in design, and, especially when used in combination, extremely difficult to model. There is no known general method for making accurate predictions of performance in one programming environment based on observations of running times in another. Experimental analysis of the abstraction allows us to have more control over the trade-off between generality and accuracy when making predictions about performance.

¹ Intel Inside™ is a registered trademark of the Intel Corporation. Windows™ 98 is a registered trademark of Microsoft Corporation. Unix™ is a registered trademark of The Open Group. Java™ is a registered trademark of Sun Microsystems, Inc. C is not a trademark. CodeWarrior™ is a trademark of Metrowerks, Inc.

304 NOTICES OF THE AMS VOLUME 48, NUMBER 3

1. Array A[1...n] contains n distinct numbers. We want the rth smallest. Set lo ← 1 and hi ← n.
2. Partition. Set x ← A[lo]. Rearrange the numbers in A[lo...hi] into three groups around some index p such that:
   a. For lo ≤ i < p, we have A[i] < x.
   b. A[p] = x.
   c. For p < j ≤ hi, we have A[j] > x.
3. Check Indices. If p = r, stop and report the element A[p]. Otherwise, if p < r, set lo ← p + 1 and go to Step 2. Otherwise, p > r, so set hi ← p − 1 and go to Step 2.

Figure 1: Selection Algorithm S. The algorithm reports the rth-smallest number from array A[1...n].

An answer to the second question is straight from the mathematician: algorithms and their analyses are beautiful and fundamental, and they deserve study by any means available, including experimentation. Certainly algorithms existed long before the first computing machine was a gleam in Charles Babbage's eye. For example, Euclid's Elements, circa 300 B.C., contains an algorithm for finding the greatest common divisor of two numbers. While many algorithms have been discovered and published over the centuries, it is only with the advent of computers that the notion of analyzing algorithmic efficiency has been formalized.

Perhaps the first real surprise in algorithm analysis occurred with Strassen's discovery in 1968 of a new method for multiplying two n × n matrices. While the classic algorithm we all learned in high school requires 2n^3 − n^2 scalar arithmetic operations, Strassen's algorithm uses fewer than 7n^(log₂ 7) − 6n^2 operations, where log₂ 7 < 2.808. Therefore, this new algorithm uses fewer operations than the classic method when n is greater than 654. Strassen's discovery touched off an intensive search for asymptotically better matrix multiplication algorithms. The current champion requires no more than cn^2.376 scalar operations for a known constant c. It is an open question whether better algorithms exist. (See any algorithms textbook, such as [1], for more about matrix multiplication.) These fancy algorithms are not much use in practice, however.
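The crossover claim can be checked numerically, very much in the experimental spirit of this article. The sketch below (function names are illustrative, not from the article) compares the classic operation count 2n^3 − n^2 with Strassen's bound 7n^(log₂ 7) − 6n^2 and searches for the first n at which the bound is smaller:

```python
import math

def classic_ops(n):
    # Scalar operations for the schoolbook method: 2n^3 - n^2.
    return 2 * n**3 - n**2

def strassen_ops(n):
    # Upper bound on Strassen's operation count: 7 n^(log2 7) - 6 n^2.
    return 7 * n**math.log2(7) - 6 * n**2

# Find the smallest n where Strassen's bound beats the classic count.
n = 1
while strassen_ops(n) >= classic_ops(n):
    n += 1
print(n)  # first n where the Strassen bound wins; the text says n > 654
```

The two counts are remarkably close near the crossover (they differ by a fraction of a percent at n = 654), which is one reason a purely asymptotic comparison gives little feel for where the advantage actually begins.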
It would be a daunting prospect even to write an error-free program for one of them, and the extra computation costs imposed by their complexity make them unlikely to outperform the classic method in typical scenarios.

Algorithm analysis is a vigorous and vital subdiscipline of computer science, providing deep new insights into the fundamental power and limitations of computation, fodder for the development of new mathematical techniques (mostly combinatorial), and, not infrequently, efficient algorithms that can have substantial impact on practice. It is clear, however, that our analytical techniques are far too weak to answer all our questions about algorithms. Computational experiments are being used to suggest new directions for research, to support or disprove conjectures, and to guide the development of new analytical tools and proof strategies. This article presents examples from two broad research efforts in experimental algorithmics: first, to develop accurate models of computation that allow closer predictions of performance, and second, to extend abstract analyses beyond traditional questions and assumptions. We start with a short tutorial on the notations and typical results obtained in algorithm analysis.

A Tutorial on Algorithm Analysis

The selection problem is to report the rth-smallest number in a collection of n numbers. For example, r = 1 refers to the minimum of the collection, and r = (n + 1)/2 is the median when n is odd. For convenience we assume that no duplicate numbers appear in the collection. Figure 1 presents a well-known selection algorithm. The n numbers are placed in an array called A in no particular order. Each number has some position from 1 to n in the array: the notation A[i] refers to the number in position i, and i is called an index. The notation e ← lo means "set e equal to the value of lo".
The main operation of algorithm S is to choose a partition element x from a contiguous subarray of A defined by indices lo and hi and to partition the subarray by rearranging its contents so that numbers smaller than x are to its left and numbers larger are to its right. This puts x at some location p; that is, A[p] = x. After partitioning, we know that x is the pth-smallest number in A. We are looking for the rth-smallest number: depending on the relationship of p to r, the algorithm either stops or repeats the process on one side or the other of p.

MARCH 2001 NOTICES OF THE AMS 305

To analyze the algorithm, we derive a function that relates input size to the cost of the computational resources used by the algorithm. The precise meanings of "input size" and "cost" depend upon the model of computation being assumed. Here we shall use the simple RAM (random access machine) model, under which all scalar numbers have unit size and all basic operations on scalar values (such as arithmetic, comparison, and copying of values) have unit cost. Therefore the input size is n. For our purposes it is sufficient to define cost as the number of times x must be compared to elements of A[lo...hi] during the partitioning step. A partitioning method is known (described later in this article) that uses hi − lo comparisons to partition the subarray A[lo...hi] of size hi − lo + 1.

The only problem remaining is to count up total costs. Obviously, the total cost depends on which partition element x = A[lo] is used each time: we may get lucky and find p = r after just one partitioning operation, or we may have to repeat the process several times. The worst-case cost, Cw(n), is the maximal number of comparisons over all arrays of size n and all r: a worst-case scenario holds, for example, when r = n and p becomes 1, 2, ..., n in successive partitioning stages. Letting t denote the cost of a single comparison of x to an array element, we have Cw(n) = t((n − 1) + (n − 2) + ... + 1) = t n(n − 1)/2.
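The partitioning scheme and the worst-case count can be made concrete. Below is a sketch of algorithm S in Python (the names are illustrative, not from the article): each partitioning pass compares x against every other element of the current subarray, so it uses exactly hi − lo comparisons, and running the algorithm on an already-sorted array with r = n produces the worst-case total of n(n − 1)/2.

```python
def select(A, r):
    """Return the r-th smallest element of A (1-indexed), together with
    the number of comparisons of x against array elements."""
    A = list(A)                # work on a copy, 0-indexed internally
    lo, hi = 0, len(A) - 1
    comparisons = 0
    while True:
        # Partition A[lo..hi] around x = A[lo], using hi - lo comparisons.
        x = A[lo]
        p = lo
        for i in range(lo + 1, hi + 1):
            comparisons += 1
            if A[i] < x:
                p += 1
                A[p], A[i] = A[i], A[p]
        A[lo], A[p] = A[p], A[lo]          # x lands at its final position p
        # Check indices (0-indexed: position p holds the (p+1)-th smallest).
        if p == r - 1:
            return A[p], comparisons
        elif p < r - 1:
            lo = p + 1
        else:
            hi = p - 1

# Worst case: a sorted input with r = n forces p = 1, 2, ..., n in turn.
n = 10
value, cost = select(range(1, n + 1), n)
print(value, cost)   # 10 and n(n-1)/2 = 45
```

On the sorted input each pass moves lo up by just one position, so the pass costs shrink by one each stage: (n − 1) + (n − 2) + ... + 1, matching the sum in the text.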
The average-case cost, Ca(n, r), is found by averaging over some probability distribution on arrays of size n. Assume here that every number in the collection is equally likely to be in position lo and thus to be selected as the partition element x. Averaging the partitioning costs over this distribution gives the expected cost Ca(n, r).
